Human annotation of lexical chains: coverage and agreement measures

Authors

  • Bill Hollingsworth
  • Simone Teufel
Abstract

Lexical chains have been used successfully in several applications, e.g. topic segmentation and summarization. In this paper, we address the problem of directly evaluating the quality of lexical chains against a human gold standard. This contrasts with previous work, where formal evaluation either relied on a word sense disambiguation task or concentrated on the final application result (the summary or the text segmentation) rather than on the lexical chains themselves. We present a small user study of human annotation of lexical chains, and a set of measures that quantify the agreement between sets of lexical chains. We also perform a small meta-evaluation comparing the best of these metrics, a partial overlap measure, to rankings of chains derived by introspection; it shows that our measure agrees reasonably well with intuition. Finally, we describe our algorithm for chain creation, which differs from previous work in several respects (for instance, it allows for adjective attribution), and report its agreement with our human annotators in terms of our new measure.

Similar resources

Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study

“Lightweight” semantic annotation of text calls for a simple representation, ideally without requiring a semantic lexicon to achieve good coverage in the language and domain. In this paper, we repurpose WordNet’s supersense tags for annotation, developing specific guidelines for nominal expressions and applying them to Arabic Wikipedia articles in four topical domains. The resulting corpus has ...

The Relationship between Iranian EFL Learners' Reading Comprehension, Vocabulary Size and Lexical Coverage of the Text: The Case of Narrative and Argumentative Genres

This study explored the relationship between EFL learners’ vocabulary size, lexical coverage of the text, and reading comprehension of texts (narrative and argumentative genres). To this end, 120 male and female students out of 180 studying at Talesh Azad University were selected based on their performance on the Nelson Proficiency Test. A Nelson reading proficiency test was also administered in orde...

Three Knowledge-Free Methods for Automatic Lexical Chain Extraction

We present three approaches to lexical chaining based on the LDA topic model and evaluate them intrinsically on a manually annotated set of German documents. After motivating the choice of statistical methods for lexical chaining with their adaptability to different languages and subject domains, we describe our new two-level chain annotation scheme, which is rooted in the concept of cohesive harm...

The Comparative Impact of Pictorial Annotations and Morphological Instruction on Lexical Inferencing of Iranian Intermediate EFL Learners

One of the main ways to acquire unfamiliar words is to make guesses about their meaning. This study investigates the comparative effects of pictorial annotations and morphological instruction on Iranian EFL learners’ lexical inferencing ability. Considering homogeneity issues using PET (Preliminary English Test), the researchers assigned the participants into two experimental and one control g...

Web-based Annotation of Anaphoric Relations and Lexical Chains

Annotating large text corpora is a time-consuming effort. Although single-user annotation tools are available, web-based annotation applications allow for distributed annotation and file access from different locations. In this paper we present the web-based annotation application Serengeti for annotating anaphoric relations, which will be extended for the annotation of lexical chains.

Publication date: 2005